1. Information storage and retrieval. The problem of information
storage and retrieval is a companion of any civilization. Any civilized
society likes to preserve its knowledge and information for its future
generations. To ensure that such information can be used by posterity,
it has to be classified and stored properly. Today's knowledge will be
available tomorrow only if we can store it in a manner that will permit
ready retrieval. As the pool of human knowledge grows, retrieval of
pertinent information becomes more and more difficult. If information is
not stored efficiently, it may turn out to be easier to rediscover
information on a certain item than to retrieve it. However, this need not
be the case. Mathematical techniques can be used to devise efficient
information storage and retrieval systems that allow for quick retrieval
of information pertinent to a given query. The problem of storing
information in a computer has many combinatorial aspects. In this paper
we try to develop some efficient information storage and retrieval
systems by using methods of Combinatorial Mathematics. Combinatorial
configurations have been used for constructing filing systems by Abraham,
Ghose and Ray-Chaudhuri [1], Ray-Chaudhuri [3] and Bose [2].

2. Description of a file. Any information system (also called a
filing system) is concerned with a large collection of units (also called
items, individuals, or documents) and information about these units on a
number of variables. These variables might be called the information
variables. The totality of the information variables, the collection of
units and the information about them constitute a file. To give an example
of a file, there may be a file for all the pilots of the Air Force of a
country. The units in such a file are the pilots and some of the
information variables may be (1) the age of the pilot, (2) the number of
combat missions flown by the pilot, (3) whether or not the pilot is a
veteran of a past war and (4) whether or not the pilot is married. In
a file for the employees of a company, the units are the employees. Some
of the information variables might be (a) whether or not the employee
worked for the research division of the company, (b) whether or not the
employee worked for more than 5 years, (c) whether or not the employee
published more than 5 scientific papers and (d) whether or not the
employee is married. In a file for research publications on copper, the
units are the research papers. Some of the information variables could be
(a) whether or not the paper is relevant for aluminium alloys, (b) whether
or not the paper is pertinent to cryogenics, (c) whether or not the paper
... variables are binary and they are variously called attributes, terms,
descriptors, clue words, or locators. If there is a nonbinary information
variable, it is possible to replace it by artificially created binary
valued information variables. One of the most important retrieval
problems is the following: we are given a subset of the set of all
attributes and we would like to know the set of individuals who satisfy
the given subset of attributes. For instance, in our example of the file
for the employees of a company, we might like to find the subcollection
of employees who are married and have published more than 5 scientific
papers. In the file for research publications on copper, we might like
to find all research publications which deal with aluminium alloys as
well as cryogenic methods. In a file each individual is usually given an
identification, which may be a name or a serial number. The
identification of the individual and the values of the information
variables for the individual constitute his record.

Large files nowadays are stored in computers. In most computerized
filing systems the records are stored in some comparatively slow permanent
memory, for instance on a tape. The address in the permanent memory of
a record is called the accession number of the record. The accession
number is usually much smaller in size than the complete record. A set of
addresses of the comparatively faster memory is reserved for storing the
accession numbers.

3. Formal definitions of a file, storage rule, retrieval rule and
inverted filing system. For any set $X$, $P(X)$ will denote the set of
subsets of $X$ and $|X|$ will denote the cardinality of $X$. A file $F$
is a triple $(I, A, f)$ where $I$ is the set of individuals (also called
documents or items), $A$ is a set of attributes (also called descriptors
or locators) and $f$ is a mapping from $I$ to $P(A)$. For $i \in I$,
$f(i)$ is the set of attributes possessed by the individual $i$. If $A$
is a set of $v$ attributes, $f(i)$ can be represented as a binary
$v$-tuple. A computerized file is a pair $(F, a)$ where $F$ is a file and
$a$ is an injective mapping from $I$ to the integers. For $i \in I$,
$a(i)$ is called the accession number of $i$. The accession number is
usually the address of the permanent computer memory location where the
identification of the individual $i$ and the corresponding record $f(i)$
are stored. For the sake of brevity, let $I^* = \{a(i) : i \in I\}$;
$I^*$ is the set of accession numbers of the items of the file. A storage
rule for the file $F$ is a triple $(I, M, s)$ where $I$ is the set of
individuals of the file $F$, $M$ is a set of fast computer memory
locations and $s$ is a mapping from $I$ to $P(M)$ such that for
$i, j \in I$, $i \ne j$, $s(i)$ and $s(j)$ have no element in common. The
accession number of the individual $i$ is stored at each of the memory
locations belonging to $s(i)$. If for each $i \in I$, $s(i)$ contains only
one element, we say that the storage rule admits no redundancy. For
$A' \subseteq A$, let $I(A') = \{i : i \in I,\ f(i) \supseteq A'\}$.
A retrieval rule is a triple $(\mathcal{Q}, I^*, r)$ where $\mathcal{Q}$
is a subset of $P(A)$ and $r$ is a mapping from $\mathcal{Q}$ to $P(I^*)$
such that

    $\forall A' \in \mathcal{Q}, \quad r(A') = \{a(i) : i \in I(A')\}$    (1)

The file $F$, together with the accession numbers, storage rule and
retrieval rule, constitutes an information system (also called a filing
system). The elements of $\mathcal{Q}$ are called queries. Equation (1)
requires that for any query $A'$, $r(A')$ contains the accession number
of each item $i$ which satisfies all the attributes of $A'$. The most
prevalent computerized filing system is the inverted filing system (also
called coordinate indexing). The storage rules and retrieval rules of the
inverted filing system can be described as follows.

Let $A = \{a_1, \ldots, a_v\}$ be a set of $v$ attributes. The set of
storage addresses $M$ is partitioned into $v$ disjoint subsets
$M_{a_1}, M_{a_2}, \ldots, M_{a_v}$ called buckets. An individual's
accession number will be stored in the bucket $M_{a_j}$ if and only if the
individual satisfies the attribute $a_j$, $1 \le j \le v$. In other words
$s$ has the property that $s(i) \cap M_a$ contains one element if and only
if $a \in f(i)$. For $a \in A$, define
$M_a^* = \{a(i) : s(i) \cap M_a \ne \emptyset,\ i \in I\}$. In other words
$M_a^*$ is the set of accession numbers stored in the memory locations
$M_a$.

The retrieval rule is also easily described for a query $A'$. We take
$r(A') = \bigcap_{a \in A'} M_a^*$. Clearly for any item $i$ which
possesses all the attributes of $A'$, $r(A')$ will contain the accession
number of $i$. If the query consists of a single attribute $a$, then
$r(\{a\}) = M_a^*$.

For queries involving only one attribute, retrieval can be done very
efficiently in the inverted filing system. However, if the query involves
more than one attribute, then it is necessary to intersect several of
the buckets $M_a^*$ and retrieval can be very slow for a large file. To
overcome this difficulty of the inverted filing system, we introduce the
principle of local structuring.
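
The inverted filing system described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names `build_buckets` and `retrieve`, the employee data and the accession numbers are all hypothetical, and Python sets stand in for the memory locations of each bucket.

```python
from collections import defaultdict


def build_buckets(f, accession):
    """Storage rule: put a(i) into the bucket of every attribute in f(i)."""
    buckets = defaultdict(set)
    for item, attributes in f.items():
        for attribute in attributes:
            buckets[attribute].add(accession[item])
    return dict(buckets)


def retrieve(buckets, query):
    """Retrieval rule: intersect the buckets of all attributes in the query."""
    if not query:
        raise ValueError("a query must name at least one attribute")
    result = None
    for attribute in query:
        bucket = buckets.get(attribute, set())
        result = set(bucket) if result is None else result & bucket
    return result


# Hypothetical employee file: item -> set of attributes it satisfies.
f = {
    "ann": {"research", "married"},
    "bob": {"married", "papers>5"},
    "cy": {"research", "papers>5", "married"},
}
accession = {"ann": 101, "bob": 102, "cy": 103}  # injective mapping a

buckets = build_buckets(f, accession)
married_and_published = retrieve(buckets, {"married", "papers>5"})
```

A single-attribute query reads one bucket directly, while every additional attribute costs one more intersection, which is the slowdown the text attributes to large files.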


4. Locally structured information systems. In most large files
most of the items satisfy only a small fraction of the descriptors.
This important fact is exploited by the principle of local structuring.
Local structuring partitions a large file into several small files
such that to retrieve the items for a given query, we need to search
only a few of these small files. Let $F = (I, A, f)$ be a file. Let
$\mathcal{Q} \subseteq P(A)$ be the set of queries of interest. Let
$\pi = (I_1, I_2, \ldots, I_b)$ be a $b$-tuple of subsets of $I$ such that
$I_j \cap I_k = \emptyset$ (the empty set) for all $j \ne k$,
$j, k = 1, \ldots, b$, and $\bigcup_{j=1}^{b} I_j = I$. In other words
$\pi$ is an ordered partition of $I$. Let $\delta = (A_1, A_2, \ldots, A_b)$
be a $b$-tuple of subsets of $A$. For $I' \subseteq I$, $A^* \subseteq A$,
$I'(A^*)$ denotes the set of items $i \in I'$ which satisfy
$f(i) \supseteq A^*$. Let $J = \{1, 2, \ldots, b\}$ and for
$A' \subseteq A$, $J_{A'} = \{j : j \in J,\ A' \subseteq A_j\}$. The pair
$(\pi, \delta)$ is said to be a local structuring for $(F, \mathcal{Q})$
if and only if

    $\forall A' \in \mathcal{Q}, \quad I(A') = \bigcup_{j \in J_{A'}} I_j(A')$
    and $\forall i \in I_j, \quad f(i) \subseteq A_j, \quad 1 \le j \le b$.    (2)

The local structuring $\ell = (\pi, \delta)$ is said to have the
equicardinality property if and only if
$|I_1| = |I_2| = \cdots = |I_b|$. Furthermore, $\ell$ is said to be
uniform if and only if $|A_1| = |A_2| = \cdots = |A_b|$. Let $f_j$ be the
restriction of $f$ to $I_j$, i.e. $f_j(i) = f(i)$ for all $i \in I_j$. The
files $F_j = (I_j, A_j, f_j)$, $1 \le j \le b$, are called the local
files. Let $M_j$, $1 \le j \le b$, be $b$ disjoint sets of computer memory
locations, $s_j$ be a storage rule for the local file $F_j$ and $r_j$ be
a retrieval rule for the local file $F_j$, $1 \le j \le b$. We define the
storage rule $s$ and retrieval rule $r$ for $F$ by

    $s(i) = s_j(i)$ if $i \in I_j$, $1 \le j \le b$
    $r(A') = \bigcup_{j \in J_{A'}} r_j(A'), \quad \forall A' \in \mathcal{Q}$    (3)

Since $I(A') = \bigcup_{j \in J_{A'}} I_j(A')$, $r$ is easily seen to be
a retrieval rule for the file $F$ with respect to the set of queries
$\mathcal{Q}$.

Let $\pi_0 = (I)$ and $\delta_0 = (A)$ with $b = 1$. Clearly the pair
$\ell_0 = (\pi_0, \delta_0)$ satisfies the requirements of local
structuring and will be called the trivial local structuring. The
principle of local structuring opens up a new dimension for methods of
file organization.
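
The two defining conditions of a local structuring can be checked mechanically on a small file. The sketch below is only an illustration under stated assumptions: the helper names `items_satisfying` and `is_local_structuring` and the three-item example file are hypothetical, and items, attributes and queries are modelled as Python sets.

```python
def items_satisfying(f, items, query):
    """I'(A*): the items of `items` whose attribute set contains `query`."""
    return {i for i in items if query <= f[i]}


def is_local_structuring(f, queries, parts, blocks):
    """Check that (parts, blocks) is a local structuring for (f, queries)."""
    all_items = set(f)
    # parts must be an ordered partition of I: disjoint and covering.
    union = set().union(*parts) if parts else set()
    if union != all_items or sum(len(p) for p in parts) != len(all_items):
        return False
    # Every item of I_j may only use attributes of the block A_j.
    for I_j, A_j in zip(parts, blocks):
        if any(not f[i] <= A_j for i in I_j):
            return False
    # For each query, searching only the local files in J_{A'} finds all items.
    for query in queries:
        relevant = [j for j, A_j in enumerate(blocks) if query <= A_j]
        found = set()
        for j in relevant:
            found |= items_satisfying(f, parts[j], query)
        if found != items_satisfying(f, all_items, query):
            return False
    return True


# Hypothetical file with three items over the attributes a, b, c.
f = {1: {"a", "b"}, 2: {"c"}, 3: {"b", "c"}}
queries = [{"b"}, {"c"}, {"a", "b"}, {"b", "c"}]

good = is_local_structuring(f, queries, [{1}, {2, 3}], [{"a", "b"}, {"b", "c"}])
trivial = is_local_structuring(f, queries, [{1, 2, 3}], [{"a", "b", "c"}])
bad = is_local_structuring(f, queries, [{1, 2}, {3}], [{"a", "b"}, {"b", "c"}])
```

In the last call item 2 carries attribute c but is placed under the block {a, b}, so the check fails.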

The central problem for a builder of a filing system will be that of
determining the optimum local structuring for a given file. The
optimality criterion itself will vary from situation to situation. In most
large files, the majority of the items satisfy only a small fraction of
the descriptors of the file. Suppose $v$ is the total number of
descriptors. Without much loss of generality one can postulate the
existence of a positive integer $t$ much smaller than $v$ such that no
item satisfies more than $t$ descriptors. The few items which satisfy more
than $t$ attributes can be grouped together into a small file. In the next
section we show that under these assumptions one can use combinatorial
configurations to construct a local structuring $\ell$ which is better
than the trivial local structuring. In this paper we only want to
emphasize the principle of local structuring. We want to point out this
alternative to the planners and builders of information systems. The
problem of determining the optimum local structuring for a given file is
a complicated and difficult problem. Local structuring will give maximum
advantage if all the local files are relatively small and the number of
local files is not too large. Finally, for every query $A'$, the
cardinality of the set $J_{A'}$ should be small. One could safely make the
claim that in almost all practical situations there will exist a local
structuring which is better than the trivial local structuring.

5. Local structuring based on combinatorial configurations. Let
$v$, $k$, $t$ and $b$ be positive integers such that $t \le k \le v$. Let
$A$ be a set of $v$ elements and $\mathcal{Q}$ be a class of subsets of
$A$. Members of $\mathcal{Q}$ are called queries. A combinatorial
configuration with parameters $(A, k, \mathcal{Q}, b)$ is a pair
$(A, \mathcal{B})$ with
$\mathcal{B} = \{A_1, A_2, \ldots, A_b\} \subseteq P(A)$ such that

    (1) $|A_j| \le k$, $1 \le j \le b$
and (2) $\forall A' \in \mathcal{Q}$, there exists at least one block
        $A_j$ such that $A_j \supseteq A'$.

The subsets $A_1, A_2, \ldots, A_b$ are called blocks. If $\mathcal{Q}$ is
the class of subsets of cardinality not greater than $t$, then the
configuration is called a $(v, k, t, b)$-configuration. Let $b(v, k, t)$
denote the smallest integer $b$ such that a $(v, k, t, b)$-configuration
exists. A $(v, k, t, b)$-configuration is said to be optimum if and only
if $b = b(v, k, t)$. We can use $(v, k, t, b)$-configurations to construct
a locally structured filing system for a file $F = (I, A, f)$ provided
no item satisfies more than $t$ descriptors. Let $A_1, A_2, \ldots, A_b$
be the blocks of the $(v, k, t, b)$-configuration. Let
$\delta = (A_1, A_2, \ldots, A_b)$. Let
$I_j' = \{i : f(i) \subseteq A_j\}$, $1 \le j \le b$. Let
$\pi = (I_1, I_2, \ldots, I_b)$ be an ordered partition of $I$ such that

    $I_j \subseteq I_j'$, for all $j$, $1 \le j \le b$    (4)

Since $\forall i \in I$, $|f(i)| \le t$, there will exist at least one
integer $j$ such that $1 \le j \le b$ and $f(i) \subseteq A_j$. Hence
partitions of $I$ satisfying (4) will exist. In particular we can take

    $I_1 = I_1', \ \ldots$    (5)

However, it will be more desirable to have a partition $\pi$ which
satisfies (4) and also satisfies the equicardinality property

    $|I_1| = |I_2| = \cdots = |I_b|$    (6)


We now show that if $\pi$ is an ordered partition satisfying (4), then
the pair $(\pi, \delta)$ is a local structuring for the pair
$(F, P(A))$. Let $A' \subseteq A$. Clearly
$I(A') \supseteq \bigcup_{j \in J_{A'}} I_j(A')$. Conversely, let
$i \in I(A')$. Let $j_0$ be such that $i \in I_{j_0}$. Then
$i \in I_{j_0}(A')$. Also $A' \subseteq f(i) \subseteq A_{j_0}$.
Therefore $j_0 \in J_{A'}$ and $i \in \bigcup_{j \in J_{A'}} I_j(A')$.
Hence $I(A') \subseteq \bigcup_{j \in J_{A'}} I_j(A')$ and the defining
property (2) holds for $(\pi, \delta)$.
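
The construction can be tried out on a small case. The sketch below checks the two conditions of a $(v, k, t, b)$-configuration by brute force and then builds a partition satisfying (4) by placing each item into the first block that contains its attribute set. The first-fit rule is an assumption made for illustration, as are the names `is_configuration` and `first_fit_partition` and the example data; it does not try to balance the sizes of the local files.

```python
from itertools import combinations


def is_configuration(attributes, blocks, k, t):
    """True if every block has at most k elements and every subset of
    `attributes` with 1..t elements lies inside at least one block."""
    if any(len(block) > k for block in blocks):
        return False
    for r in range(1, t + 1):
        for query in combinations(sorted(attributes), r):
            if not any(set(query) <= block for block in blocks):
                return False
    return True


def first_fit_partition(f, blocks):
    """Assign each item i to the first block A_j with f(i) a subset of A_j."""
    parts = [set() for _ in blocks]
    for item, attrs in f.items():
        for j, block in enumerate(blocks):
            if attrs <= block:
                parts[j].add(item)
                break
        else:
            raise ValueError(f"no block contains the attributes of {item!r}")
    return parts


# Hypothetical example: v = 4 attributes, blocks = all 3-element subsets.
attributes = {1, 2, 3, 4}
blocks = [set(c) for c in combinations(sorted(attributes), 3)]
f = {"x": {1, 2}, "y": {3, 4}, "z": {4}, "w": set()}

covers_pairs = is_configuration(attributes, blocks, k=3, t=2)
covers_all = is_configuration(attributes, blocks, k=3, t=4)
parts = first_fit_partition(f, blocks)
```

Here the blocks are generated in the order {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, so the last local file stays empty; a partition closer to the equicardinality property (6) would need a different assignment rule.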


Let $f_j$ denote the restriction of $f$ to $I_j$, $1 \le j \le b$. Let
$F_j = (I_j, A_j, f_j)$, $1 \le j \le b$, be the local files. Let $M_j$
be a set of computer memory locations reserved for the file $F_j$. Let
$s_j$ and $r_j$ respectively denote storage rules and retrieval rules for
the file $F_j$, $1 \le j \le b$. Then the storage rule $s$ and the
retrieval rule $r$ for the file $F$ are defined by

    $s(i) = s_j(i)$ if $i \in I_j$, $1 \le j \le b$,
    $r(A') = \bigcup_{j \in J_{A'}} r_j(A'), \quad \forall A' \in P(A)$.

We note that if the local storage rules $s_j$ do not admit any redundancy,
the storage rule $s$ also has no redundancy. The following is a schematic
representation of the local structuring constructed above.


Blocks    Sets of items of    Computer memory       Storage rule of    Retrieval rule of
          the local files     locations for the     the local file     the local files
                              local files

$A_1$     $I_1$               $M_1$                 $s_1$              $r_1$
$A_2$     $I_2$               $M_2$                 $s_2$              $r_2$
...       ...                 ...                   ...                ...
$A_b$     $I_b$               $M_b$                 $s_b$              $r_b$

Note that any particular item is stored in exactly one local file. For
an item $i$, we need to determine an integer $j_0$ for which
$A_{j_0} \supseteq f(i)$. Then the item $i$ will be stored in the local
file $F_{j_0}$ according to the rule $s_{j_0}$. To retrieve for a query
$A'$, the computer program will first scan the list of blocks and
determine the set $J_{A'}$ of integers $j$ for which the block $A_j$
contains $A'$. Once the set $J_{A'}$ is determined, the files $F_j$ with
$j \notin J_{A'}$ can be ignored for the purpose of retrieving for the
query $A'$. The program will find the items satisfying $A'$ in the file
$F_j$ for each $j \in J_{A'}$ and take the union of these sets of items.
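
The storage and retrieval procedure just described can be sketched as a small class. This is a hedged illustration rather than the paper's program: the class name `LocallyStructuredFile`, its method names and the sample blocks are hypothetical, each local file is kept as a small inverted file, and an item is stored in the first block that contains its attributes (an assumed rule).

```python
class LocallyStructuredFile:
    """One small inverted file per block; queries touch only relevant blocks."""

    def __init__(self, blocks):
        self.blocks = [frozenset(block) for block in blocks]
        # local[j] maps an attribute to the accession numbers stored under it.
        self.local = [dict() for _ in self.blocks]

    def store(self, accession_number, attributes):
        """Store the item in the first local file whose block contains f(i)."""
        attributes = frozenset(attributes)
        for j, block in enumerate(self.blocks):
            if attributes <= block:
                for a in attributes:
                    self.local[j].setdefault(a, set()).add(accession_number)
                return j
        raise ValueError("no block contains this attribute set")

    def relevant_blocks(self, query):
        """J_{A'}: indices of the blocks that contain the whole query."""
        query = frozenset(query)
        return [j for j, block in enumerate(self.blocks) if query <= block]

    def retrieve(self, query):
        """Union over J_{A'} of the local bucket intersections."""
        query = frozenset(query)
        if not query:
            raise ValueError("a query must name at least one attribute")
        result = set()
        for j in self.relevant_blocks(query):
            buckets = [self.local[j].get(a, set()) for a in query]
            result |= set.intersection(*buckets)
        return result


# Hypothetical blocks and items.
lsf = LocallyStructuredFile([{"a", "b", "c"}, {"c", "d", "e"}])
placements = [
    lsf.store(1, {"a", "b"}),
    lsf.store(2, {"c"}),
    lsf.store(3, {"c", "d"}),
    lsf.store(4, {"d", "e"}),
]
```

A query such as {"d"} only opens the second local file, while {"a", "d"} is contained in no block and therefore opens none.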


6. Retrieval efficiency of locally structured information systems.
In this section we develop some expressions for the retrieval times
required for various queries in a locally structured information system.
These expressions are based on some simplifying assumptions which can
only be an approximation to the very complex situation for a real file.
The discussion in this section should be viewed only as an approach for
computing the retrieval efficiency of a locally structured information
system. However, it is hoped that the methods developed in this section
will provide a guide for comparing the retrieval efficiencies of various
locally structured information systems. Let $N$ denote the set of natural
numbers. Let $F = (I, A, f)$ be a file with $\mathcal{Q} \subseteq P(A)$
as the set of queries of interest. Let $\pi = (I_1, I_2, \ldots, I_b)$ and
$\delta = (A_1, A_2, \ldots, A_b)$ be $b$-tuples of subsets of $I$ and $A$
respectively such that $\ell = (\pi, \delta)$ is a local structuring for
$(F, \mathcal{Q})$. Let $T_\ell(A')$ denote the amount of time (i.e. the
number of units of computational time) required to retrieve all items for
the query $A'$. $T_\ell(A')$ consists of two components $T_1(A')$ and
$T_2(A')$. $T_1(A')$ is the amount of time required to determine the set
$J_{A'}$. $T_2(A')$ is the amount of time required to search within the
local files $F_j$ for $j \in J_{A'}$. The amount of computing time
required to check whether the relation $A' \subseteq A_j$ holds or not
will depend on the cardinalities $|A'|$ and $|A_j|$. We assume that there
exists a function $\alpha : N \times N \to N$ such that
$\alpha(|A'|, |A_j|)$ is the amount of time required to check the relation
$A' \subseteq A_j$ for all $j$, $1 \le j \le b$. Under these assumptions

    $T_1(A') = \sum_{j=1}^{b} \alpha(|A'|, |A_j|)$

We assume that the sum $\sum_{j=1}^{b} \alpha(|A'|, |A_j|)$ is a function
of the cardinalities $|A'|$ and $\sum_{j=1}^{b} |A_j|$. Let
$\chi : N \times N \to N$ be a function such that

    $T_1(A') = \chi\left(|A'|, \sum_{j=1}^{b} |A_j|\right)$    (8)

For the trivial local structuring $\ell_0$, the component $T_1(A')$ is
equal to 0. The function $\chi$ is assumed to have the property that
$\chi(r, v) = 0$ for all positive integers $r$. Clearly the time required
to retrieve for a query $A'$ in a file $F = (I, A, f)$ will depend on
$|A'|$, $|A|$ and the number of items in the file. Let
$\Phi : N \times N \times N \to N$ be a function such that
$\Phi(|A'|, |I|, |A|)$ is the amount of time required to retrieve for
a query $A'$. Under these assumptions we get

    $T_2(A') = \sum_{j \in J_{A'}} \Phi(|A'|, |I_j|, |A_j|), \quad \forall A' \in \mathcal{Q}$    (9)

Let $n = |I|$, $n_j = |I_j|$, $k_j = |A_j|$, $1 \le j \le b$,
$\bar{n} = \frac{1}{b} \sum_{j=1}^{b} n_j$ and
$\bar{k} = \frac{1}{b} \sum_{j=1}^{b} k_j$. Let $r \le t$ be a positive
integer and $P_r(A)$ denote the set of subsets of $A$ with cardinality
equal to $r$. For an integer $r$, $T_\ell(r)$ will denote the average
retrieval time for a query $A'$ with $|A'| = r$ in the local structuring
$\ell$. Combining the expressions (8) and (9) we get

    $T_\ell(A') = \chi(|A'|, b\bar{k}) + \sum_{j \in J_{A'}} \Phi(|A'|, n_j, k_j)$

    $T_\ell(r) = \frac{1}{\binom{v}{r}} \sum_{A' \in P_r(A)} T_\ell(A')
               = \chi(r, b\bar{k}) + \frac{1}{\binom{v}{r}} \sum_{j=1}^{b} \sum_{A' \in P_r(A),\, A' \subseteq A_j} \Phi(r, n_j, k_j)$
               $= \chi(r, b\bar{k}) + \frac{1}{\binom{v}{r}} \sum_{j=1}^{b} \Phi(r, n_j, k_j) \binom{k_j}{r}$    (10)

If the local structuring $\ell$ satisfies the equicardinality property
and the uniformity property, i.e. if $n_1 = n_2 = \cdots = n_b = \bar{n}$
and $k_1 = k_2 = \cdots = k_b = k$, then we get

    $T_\ell(r) = \chi(r, bk) + \frac{b \binom{k}{r}}{\binom{v}{r}} \Phi(r, \bar{n}, k)$    (11)

If the $(v, k, t, b)$-configuration is uniform and the local structuring
$\ell$ does not necessarily satisfy the equicardinality property, then the
formula (10) reduces to

    $T_\ell(r) = \chi(r, bk) + \frac{\binom{k}{r}}{\binom{v}{r}} \sum_{j=1}^{b} \Phi(r, n_j, k)$    (12)

It can be expected that $\sum_{j=1}^{b} \Phi(r, n_j, k)$ will take the
smallest possible value when all the $n_j$'s are equal, $1 \le j \le b$.
Therefore it will be desirable to make sure that the local structuring
$\ell$ satisfies the equicardinality property as closely as possible. The
formula (11) suggests that for fixed $v$, $k$ and $t$, the retrieval time
$T_\ell(r)$ takes the smallest possible value when the number of blocks
$b$ is the smallest possible. Therefore for fixed $v$, $k$ and $t$, one
should look for a $(v, k, t, b)$-configuration which minimizes the number
of blocks. For the trivial local structuring (11) reduces to

    $T_{\ell_0}(r) = \Phi(r, n, v)$    (13)


Let $\omega(A')$ be a nonnegative real number satisfying

    $0 \le \omega(A') \le 1$
and $\sum_{A' \in \mathcal{Q}} \omega(A') = 1$

where $\mathcal{Q}$ is the set of queries of interest. The number
$\omega(A')$ is the weight attached to the query $A'$ and measures the
relative importance of the query $A'$. The retrieval parameter $T(\ell)$
for the local structuring $\ell$ (with respect to the weight system
$\omega(A')$, $A' \in \mathcal{Q}$) can be defined by

    $T(\ell) = \sum_{A' \in \mathcal{Q}} T_\ell(A')\, \omega(A')$
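
The averaging step behind formula (10) and the weighted retrieval parameter can be evaluated numerically once cost functions are chosen. The sketch below is illustrative only: the names `average_retrieval_time` and `retrieval_parameter` are hypothetical, and the toy cost functions `toy_chi` and `toy_Phi` are placeholders chosen to make the arithmetic easy to follow, not functions taken from the paper.

```python
from math import comb


def average_retrieval_time(r, v, local_sizes, chi, Phi):
    """Formula (10): local_sizes is a list of (n_j, k_j) pairs."""
    total_block_size = sum(k_j for _, k_j in local_sizes)  # equals b * k-bar
    search = sum(Phi(r, n_j, k_j) * comb(k_j, r) for n_j, k_j in local_sizes)
    return chi(r, total_block_size) + search / comb(v, r)


def retrieval_parameter(weights, time_for_query):
    """T(l): weighted sum of T_l(A') over the queries of interest."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("the weights must sum to 1")
    return sum(w * time_for_query(q) for q, w in weights.items())


def toy_chi(r, total_block_size):
    """Placeholder: ignore the cost of locating the relevant blocks."""
    return 0


def toy_Phi(r, n, k):
    """Placeholder: search cost proportional to query size and file size."""
    return r * n


# v = 4 attributes, queries of size r = 2, 20 items in total.
trivial = average_retrieval_time(2, 4, [(20, 4)], toy_chi, toy_Phi)
uniform = average_retrieval_time(2, 4, [(5, 3)] * 4, toy_chi, toy_Phi)

weights = {frozenset({"a"}): 0.25, frozenset({"a", "b"}): 0.75}
weighted = retrieval_parameter(weights, lambda q: 10 * len(q))
```

With four equal local files of block size 3, the second call agrees with formula (11): $b \binom{3}{2} / \binom{4}{2} = 2$ local files are searched on average, each holding 5 items.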

7. Locally structured inverted filing system. Let $F = (I, A, f)$
be a file, $\pi = (I_1, I_2, \ldots, I_b)$ be an ordered partition of $I$,
$\delta = (A_1, A_2, \ldots, A_b)$ be a $b$-tuple of subsets of $A$,
$\ell = (\pi, \delta)$ be a local structuring and
$F_j = (I_j, A_j, f_j)$, $1 \le j \le b$, be the local files of $F$. Let
$s_j$ and $r_j$ be respectively storage rules and retrieval rules of an
inverted filing system for $F_j$. Then we define

    $s(i) = s_j(i)$, if $i \in I_j$
and $r(A') = \bigcup_{j=1}^{b} r_j(A'), \quad \forall A' \in \mathcal{Q}$

where $\mathcal{Q}$ is the set of queries of interest. The local
structuring $\ell$ together with the storage rule $s$ and retrieval rule
$r$ will be called a locally structured inverted filing system. We now
derive expressions for the retrieval parameter $T_\ell(r)$ for the locally
structured inverted filing system. First we consider the trivial local
structuring $\ell_0$.

For a query $A' = \{a_1, \ldots, a_r\}$ the retrieval time $T_2(A')$ can
be split up into two components $T_2'(A')$ and $T_2''(A')$. $T_2'(A')$ is
the amount of time required to identify the buckets corresponding to the
attributes belonging to $A'$. Let $d$ be the amount of time required
by the computer to check the equality of two attributes. We assume that
there exists a function $\varphi : N \times N \to N$ such that
$T_2'(A') = d\,\varphi(|A'|, |A|)$. $T_2''(A')$ is the amount of time
required to form the intersection $\bigcap_{a \in A'} M_a^*$ where $M_a^*$
is the bucket corresponding to the attribute $a$. Let $c$ be the amount of
time required to test the equality of two accession numbers. We postulate
that there exists a function $\theta : N \times N \times N \to N$ such
that

    $T_2''(A') = c\,\theta(r, n, v)$    (14)

Under these assumptions, we get

    $T_{\ell_0}(r) = d\,\varphi(r, v) + c\,\theta(r, n, v)$    (15)

Similarly for a locally structured inverted filing system $\ell$ with the
equicardinality property and the uniformity property we get

    $T_\ell(r) = \chi(r, bk) + \frac{b \binom{k}{r}}{\binom{v}{r}} \left[ d\,\varphi(r, k) + c\,\theta\left(r, \frac{n}{b}, k\right) \right]$    (16)

If we further make the simplifying assumption that
$\chi(r, bk) = d\,\varphi(r, b)$, we get

    $T_\ell(r) = d \left( \varphi(r, b) + \frac{b \binom{k}{r}}{\binom{v}{r}} \varphi(r, k) \right) + c\, \frac{b \binom{k}{r}}{\binom{v}{r}}\, \theta\left(r, \frac{n}{b}, k\right)$    (17)


8. Lexicographic storage and retrieval rules in an inverted
filing system. Consider an inverted filing system for a file
$F = (I, A, f)$ with $A = \{a_1, a_2, \ldots, a_v\}$. Let $M_i$ be the
bucket corresponding to the attribute $a_i$, $1 \le i \le v$. Let $a(m)$
denote the accession number stored in the memory location $m$. Let
$M' = (m_1, m_2, \ldots, m_p)$ be a bucket where $m_1, m_2, \ldots, m_p$
is a natural ordering of the memory locations in $M'$. The lexicographic
storage rule requires that $a(m_1) < a(m_2) < \cdots < a(m_p)$. Consider
two buckets $M_1 = (m_{11}, m_{12}, \ldots, m_{1p})$ and
$M_2 = (m_{21}, m_{22}, \ldots, m_{2q})$ where
$a(m_{11}) < a(m_{12}) < \cdots < a(m_{1p})$ and
$a(m_{21}) < a(m_{22}) < \cdots < a(m_{2q})$. To form the intersection
$M_1^* \cap M_2^*$, the program starts with $a(m_{11})$, compares it with
$a(m_{21})$, $a(m_{22})$ etc. and finds the smallest integer $j$ such that
$1 \le j \le q$ and $a(m_{11}) = a(m_{2j})$. If no such integer exists,
then clearly $\{a(m_{11})\} \cap M_2^* = \emptyset$. Suppose such an
integer $j$ exists. Let $N_1^* = M_1^* - \{a(m_{11})\}$ and
$N_2^* = M_2^* - \{a(m_{21}), \ldots, a(m_{2j})\}$. Then it is easily seen
that

    $M_1^* \cap M_2^* = \{a(m_{11})\} \cup (N_1^* \cap N_2^*)$
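
Intersecting two buckets that are stored in increasing order of accession number can be sketched with a two-pointer merge. This is a standard technique offered here as an illustration, not necessarily the exact procedure the author had in mind; the function name `intersect_sorted` and the sample buckets are hypothetical. The counter records one three-way comparison per loop step, so it measures this sketch rather than the paper's estimate.

```python
def intersect_sorted(m1, m2):
    """Intersect two ascending lists of accession numbers.

    Returns the common accession numbers and the number of three-way
    comparisons performed.
    """
    i = j = comparisons = 0
    common = []
    while i < len(m1) and j < len(m2):
        comparisons += 1
        if m1[i] == m2[j]:
            common.append(m1[i])
            i += 1
            j += 1
        elif m1[i] < m2[j]:
            i += 1  # m1[i] cannot occur later in m2
        else:
            j += 1  # m2[j] cannot occur later in m1
    return common, comparisons


# Hypothetical buckets, both stored in increasing order.
bucket_1 = [3, 8, 15, 21]
bucket_2 = [1, 3, 9, 15, 30]
common, comparisons = intersect_sorted(bucket_1, bucket_2)
```

Because each step advances at least one pointer, the number of comparisons never exceeds the sum of the two bucket sizes.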


We now see that the number of comparisons of accession numbers to compute
$M_1^* \cap M_2^*$ will be $\min(|M_1^*|, |M_2^*|)$. If
$|M_1^*| = |M_2^*|$, then the number is equal to the common cardinality.
Similarly to compute the set $M_1^* \cap M_2^* \cap M_3^*$, the required
number of comparisons of accession numbers will be
$\min(|M_1^*|, |M_2^*|) + \min(|M_1^* \cap M_2^*|, |M_3^*|)$. If we assume
that all attributes are equally likely, i.e.

    $\left| \bigcap_{a \in A'} M_a^* \right| = \frac{n}{\binom{v}{r}}, \quad \forall A' \in P_r(A), \quad n = |I|$,

then the number of comparisons of accession numbers required to compute
$\bigcap_{a \in A'} M_a^*$ will be

    $\frac{n}{\binom{v}{1}} + \frac{n}{\binom{v}{2}} + \cdots + \frac{n}{\binom{v}{r-1}}$.

Therefore, for $A' \in P_r(A)$, $r > 1$, $T_2''(A')$, the time required to
compute the intersection $\bigcap_{a \in A'} M_a^*$, will be given by
$\sum_{i=1}^{r-1} \frac{cn}{\binom{v}{i}}$ where $c$ is the amount of time
required to compare two accession numbers. In other words, as a first
approximation we may assume that the function
$\theta : N \times N \times N \to N$ is given by

    $\theta(x, y, z) = \sum_{i=1}^{x-1} \frac{y}{\binom{z}{i}}$, if $x > 1$    (18)
and $\theta(x, y, z) = 0$, if $x = 1$.

Next we analyze the amount of time required to check if a relation
$A' \subseteq A''$ holds where $A', A'' \subseteq A$. Suppose the
attributes are assigned numbers. The attributes belonging to a subset $A'$
of $A$ will be stored in ascending order. If the rules of lexicographic
search are applied, then the number of comparisons of attribute numbers
required to test $A' \subseteq A''$ will be $\min(|A'|, |A''|)$. Therefore
the time required to test $A' \subseteq A''$ will be
$d \min(|A'|, |A''|)$ where $d$ is the amount of time required to compare
two attribute numbers. In other words the function
$\varphi : N \times N \to N$ introduced in the last section can be assumed
to be

    $\varphi(x, y) = xy$    (19)

Under these simplifying assumptions, for $r > 1$ the formulas (15) and
(17) reduce to

    $T_{\ell_0}(r) = drv + \sum_{i=1}^{r-1} \frac{cn}{\binom{v}{i}}$    (20)

    $T_\ell(r) = d \left( rb + rkb\, \frac{\binom{k}{r}}{\binom{v}{r}} \right) + \frac{\binom{k}{r}}{\binom{v}{r}} \sum_{i=1}^{r-1} \frac{cn}{\binom{k}{i}}$

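
The cost model assembled in (15) and (17) with the choices (18) and (19) can be evaluated exactly with rational arithmetic. The sketch below is a hedged illustration: the function names `theta`, `phi`, `T_trivial` and `T_local` and the sample parameters are assumptions made here, and `T_local` presumes a uniform, equicardinal local structuring with $\chi(r, bk) = d\,\varphi(r, b)$.

```python
from fractions import Fraction
from math import comb


def theta(x, y, z):
    """Formula (18): comparisons needed to intersect x buckets (0 if x = 1)."""
    return sum(Fraction(y, comb(z, i)) for i in range(1, x))


def phi(x, y):
    """Formula (19)."""
    return x * y


def T_trivial(r, n, v, c, d):
    """Formula (15) for the trivial local structuring."""
    return d * phi(r, v) + c * theta(r, n, v)


def T_local(r, n, v, k, b, c, d):
    """Formula (17): uniform blocks of size k, b local files of n/b items."""
    searched = Fraction(b * comb(k, r), comb(v, r))  # average size of J_{A'}
    return (d * (phi(r, b) + searched * phi(r, k))
            + c * searched * theta(r, Fraction(n, b), k))


# Hypothetical parameters: v = 10 attributes, b = 10 blocks of size k = 9.
t0 = T_trivial(2, 1000, 10, c=1, d=1)
t1 = T_local(2, 1000, 10, 9, 10, c=1, d=1)
```

With these small numbers the local structuring is still slower, since the fixed cost of scanning blocks dominates; the advantage only appears once $n$ is large relative to $d/c$.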

9. Boolean local structuring. Let $v > k \ge t > 0$ be integers.
Let $A$ be a set of $v$ attributes and
$P_k(A) = \{A_1, A_2, \ldots, A_b\}$, $b = \binom{v}{k}$, be the set of
$k$-element subsets of $A$. Clearly the blocks $A_1, A_2, \ldots, A_b$
define a $(v, k, t, b)$-configuration. A local structuring $\ell$ based on
this trivial $(v, k, t, b)$-configuration will be called a Boolean
$(v, k, t)$-local structuring. We assume that the local structuring $\ell$
has the equicardinality property and that an inverted filing system with
lexicographic storage and retrieval rules is used for each local file.
The retrieval parameters $T_\ell(r)$ and $T_{\ell_0}(r)$ are given by

    $T_{\ell_0}(1) = dv$

    $T_\ell(1) = d \left( \binom{v}{k} + k \binom{v-1}{k-1} \right)$

    $T_{\ell_0}(r) = drv + \sum_{i=1}^{r-1} \frac{cn}{\binom{v}{i}}, \quad r > 1$

    $T_\ell(r) = d \left( r \binom{v}{k} + rk \binom{v-r}{k-r} \right) + \frac{\binom{k}{r}}{\binom{v}{r}} \sum_{i=1}^{r-1} \frac{cn}{\binom{k}{i}}, \quad r > 1$

Clearly $T_{\ell_0}(1) < T_\ell(1)$ and the trivial local structuring is
better than any other local structuring for retrieval of queries of
size 1. If $r > 1$ and $1 \le i \le r-1$,

    $\frac{\binom{k}{r}}{\binom{v}{r} \binom{k}{i}} < \frac{1}{\binom{v}{i}}$

Therefore

    $\frac{\binom{k}{r}}{\binom{v}{r}} \sum_{i=1}^{r-1} \frac{1}{\binom{k}{i}} < \sum_{i=1}^{r-1} \frac{1}{\binom{v}{i}}$

and for fixed $v$, $k$, $r$, $c$ and $d$ and sufficiently large $n$ (i.e.
the number of items in the file) $T_\ell(r) < T_{\ell_0}(r)$. In fact, as
the size of a file grows, the size of the accession numbers grows and the
constant $c$ will tend to become larger.
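As an illustration consider the case $k = t = v - 1$. Then we get the
expressions

    $T_{\ell_0}(1) = dv$

    $T_\ell(1) = d(v + (v-1)^2)$

    $T_{\ell_0}(r) = drv + \sum_{i=1}^{r-1} \frac{cn}{\binom{v}{i}}, \quad r > 1$

    $T_\ell(r) = d(rv + r(v-1)(v-r)) + \frac{v-r}{v} \sum_{i=1}^{r-1} \frac{cn}{\binom{v-1}{i}}, \quad r > 1$

    $T_{\ell_0}(2) = 2dv + \frac{cn}{v}$

    $T_\ell(2) = d(2v + 2(v-1)(v-2)) + \frac{cn(v-2)}{v(v-1)}$

If $n > \frac{2dv(v-1)^2(v-2)}{c}$, then $T_{\ell_0}(2) > T_\ell(2)$.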
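
The file size at which a Boolean local structuring starts to pay off can be computed by equating the two expressions of (20) and solving for $n$. The sketch below does this with exact fractions; the function name `boolean_crossover` and the sample parameters are hypothetical, and the result rests on the simplified cost model of Sections 7 and 8 ($\varphi(x, y) = xy$, formula (18) for $\theta$, equal-sized local files).

```python
from fractions import Fraction
from math import comb


def boolean_crossover(v, k, r, c=1, d=1):
    """File size n at which T_l(r) equals T_l0(r) for a Boolean structuring.

    Requires 1 < r <= k < v. For n above the returned value the locally
    structured file is faster under the simplified cost model.
    """
    b = comb(v, k)                                  # number of blocks
    rho = Fraction(comb(k, r), comb(v, r))          # binom(k,r) / binom(v,r)
    s_v = sum(Fraction(1, comb(v, i)) for i in range(1, r))
    s_k = sum(Fraction(1, comb(k, i)) for i in range(1, r))
    extra_attribute_cost = d * (r * b + r * k * b * rho - r * v)
    saving_per_item = c * (s_v - rho * s_k)
    return extra_attribute_cost / saving_per_item


small = boolean_crossover(5, 4, 2)
larger = boolean_crossover(10, 9, 2)
```

For $k = v - 1$ and $r = 2$ the returned value equals $2dv(v-1)^2(v-2)/c$, so the threshold grows quickly with the number of attributes unless $c$ is large compared with $d$.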


10.  Projective  local  structurinf;.  Let  m te  a positive  integer 
and  q "be  a prime  power.  Let  GF(q)  denote  the  Galois  field  of  order  q , 
Let  PG(tn-1  , q)  be  the  (m-1 ) - dimensional  projective  space  over  GF(q)  . 
The  number  of  ( 1-1 )-  flats  of  PG(m-1,q)  is  given  by  the  function 
0(m,l,q)  where  0(m,l,q)  = -^ — ~ -p — 1 5 1 < 1<  m; 

(q-  i)(q  -1  ) ...  (q--!  ) 

and  0(m,o,q)  = 1 . Let  1 be  a fixed  integer  satisfying  1 < 1<  m . 
Let v = φ(m,1,q), k = φ(l,1,q), b = φ(m,l,q) and t be a positive
integer satisfying t ≤ l.  Let A be the set of points of PG(m-1,q).
Corresponding to the b (l-1)-flats of PG(m-1,q), we take blocks A_1,
A_2, ..., A_b.  The ith block A_i contains the points belonging to the
ith (l-1)-flat, 1 ≤ i ≤ b.  Then for 1 ≤ i ≤ b, |A_i| = k.  A set of
t points will determine a (c-1)-flat where c ≤ t.  Let A' be a
set of t points lying in a (c-1)-flat.  Then the number of blocks (or
(l-1)-flats) containing A' will be given by φ(m-c,l-c,q).  Since
m-c ≥ l-c ≥ 0, φ(m-c,l-c,q) > 0.  Therefore the blocks A_1, ..., A_b
define a (v,k,t,b)-configuration and the corresponding local structuring
is called a projective local structuring.  For computational purposes we
realize the flats of PG(m-1,q) as subspaces of an m-dimensional vector
space.  Let V be the vector space of m-tuples x = (x_1, x_2, ..., x_m)
of elements of GF(q).  Let α_0 = 0, α_1 = 1, α_2, ..., α_{q-1} be the q
elements of K = GF(q).  An (l-1)-flat of PG(m-1,q) will be represented
by an l-dimensional subspace of V.  In particular a 0-flat or a point
of PG(m-1,q) will be represented by a 1-dimensional subspace of V.
A 1-dimensional subspace of V will be represented by a nonnull
vector belonging to it.  Let U_i be a set of vectors of V defined by

    U_i = {(0, 0, ..., 0, 1, β_1, β_2, ..., β_{m-i}) | β_j ∈ K, 1 ≤ j ≤ m-i} ,  1 ≤ i ≤ m ,

where the entry 1 stands in the ith position.  Let U = U_1 ∪ U_2 ∪ ... ∪ U_m.
It is easily seen that any two vectors of U are linearly independent.
The number of vectors of U is
q^{m-1} + q^{m-2} + ... + q^0 = φ(m,1,q).  The φ(m,1,q) projective points
(or 1-dimensional subspaces of V) are conveniently represented by
vectors belonging to U.  In other words in the projective local
structuring every attribute will be identified with a vector belonging to U.
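
The set U can be listed mechanically. The sketch below is in Python, and the function name `point_representatives` is hypothetical. It labels the elements α_0, α_1, ..., α_{q-1} of K by the integers 0, 1, ..., q-1, so that 0 and 1 stand for α_0 = 0 and α_1 = 1. The enumeration itself needs no field arithmetic.

```python
from itertools import product


def point_representatives(m: int, q: int) -> list[tuple[int, ...]]:
    """List U = U_1 ∪ ... ∪ U_m, one vector per point of PG(m-1, q).

    Field elements are labelled 0..q-1 (an assumption of this sketch),
    with label 0 for the zero element and label 1 for the unit element.
    """
    reps = []
    for i in range(1, m + 1):
        # U_i: (i-1) zeros, then a 1, then m-i arbitrary field elements
        for tail in product(range(q), repeat=m - i):
            reps.append((0,) * (i - 1) + (1,) + tail)
    return reps


U = point_representatives(3, 2)
print(len(U))   # number of attributes when m = 3, q = 2
```

Each vector of U has its first nonzero coordinate equal to 1, which is why no two of them are proportional.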
The blocks, or the (l-1)-flats for l > 1, are more
conveniently represented by matrices.  Let B be an (m-l) × m matrix
with entries from K and rank equal to m-l.  Let 0 be the (m-l) × 1
matrix with all entries equal to 0.  The set of vectors x which
satisfy the matrix equation

    B x^T = 0

defines an l-dimensional subspace.  Every block and the corresponding
local file will be identified by a matrix B.  Let A' = {x^(1), x^(2),
..., x^(r)} be a set of r attributes.  Let A_i be a block given by
the equation B x^T = 0.  To test if A' ⊂ A_i, we need to check if
B (x^(j))^T is equal to 0 or not for j = 1, 2, ..., r.  Thus for projective
local structuring, the amount of time required to identify the local files
relevant for a given query will probably be less than the same for Boolean
local structuring.
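
The block test described above amounts to r matrix-vector products over GF(q). The following Python sketch assumes that q = p is a prime, so that arithmetic in GF(p) is ordinary arithmetic modulo p; a prime power q would need a genuine finite field implementation. The function names are hypothetical.

```python
def satisfies(B, x, p):
    """True if B x^T = 0 over GF(p), with p prime (assumption of this sketch)."""
    return all(sum(b * xi for b, xi in zip(row, x)) % p == 0 for row in B)


def query_in_block(B, query, p):
    """True if every attribute vector of the query lies in the block given by B."""
    return all(satisfies(B, x, p) for x in query)


# m = 3, l = 2, p = 2: a block is a line of PG(2, 2), so B is 1 x 3.
B = [[1, 1, 1]]                                    # the line x1 + x2 + x3 = 0
print(query_in_block(B, [(1, 1, 0), (0, 1, 1)], 2))
print(query_in_block(B, [(1, 1, 0), (1, 0, 0)], 2))
```

The first query lies on the line and the second does not, because (1, 0, 0) fails the equation.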

11.  Euclidean local structuring.  Let m be a positive integer, q
be a prime power and K = GF(q) be the Galois field of order q.  Let
Σ_{m-1} be an (m-1)-dimensional projective space over GF(q).  Let Σ_{m-2}
be an (m-2)-flat.  The Euclidean space EG(m-1,q) is the space
consisting of the l-flats of Σ_{m-1} which are not contained in Σ_{m-2},
l = 0, 1, ..., m-1.  Let φ(m,l,q) be the function introduced in the
last section.  The number of (l-1)-flats of EG(m-1,q) is given by
φ(m,l,q) - φ(m-1,l,q).  Let A be the set of 0-flats of EG(m-1,q).
Let v = φ(m,1,q) - φ(m-1,1,q), k = φ(l,1,q) - φ(l-1,1,q),
b = φ(m,l,q) - φ(m-1,l,q) and t be a positive integer satisfying t ≤ l.
Corresponding to the b (l-1)-flats we take b blocks A_1, A_2, ..., A_b.
The ith block A_i contains the elements of A corresponding to the 0-flats
contained in the ith (l-1)-flat.  It is easily seen that the blocks
A_1, A_2, ..., A_b satisfy the properties of a (v,k,t,b)-configuration.
The corresponding local structuring is called a Euclidean local
structuring.  Let V be the (m-1)-dimensional vector space of
(m-1)-tuples (x_1, x_2, ..., x_{m-1}) of elements of GF(q).  For
computational purposes, the 0-flats of EG(m-1,q) can be taken to be the
vectors of V.  Let B and a be respectively (m-l) × (m-1) and (m-l) × 1
matrices with elements from K such that the rank of B is (m-l).  Then
the set of vectors x = (x_1, x_2, ..., x_{m-1}) which satisfy the matrix
equation

    B x^T = a

will be an (l-1)-flat of EG(m-1,q).  Therefore each local file of
the Euclidean local structuring will be identified by a pair of matrices
(B,a).
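
As an illustration of the Euclidean case, the Python sketch below computes the parameters v, k, b from the counting function φ and lists the points of one block by brute force. It assumes q = p is prime, so that GF(p) arithmetic is arithmetic modulo p. All function names are hypothetical, and `phi` returns 0 outside the range 0 ≤ l ≤ m as a convention of this sketch.

```python
from itertools import product


def phi(m, l, q):
    """Number of (l-1)-flats of PG(m-1, q); 0 outside 0 <= l <= m (convention)."""
    if l < 0 or l > m:
        return 0
    num = den = 1
    for i in range(l):
        num *= q ** (m - i) - 1
        den *= q ** (l - i) - 1
    return num // den


def euclidean_parameters(m, l, q):
    """(v, k, b) of the Euclidean local structuring built from EG(m-1, q)."""
    v = phi(m, 1, q) - phi(m - 1, 1, q)
    k = phi(l, 1, q) - phi(l - 1, 1, q)
    b = phi(m, l, q) - phi(m - 1, l, q)
    return v, k, b


def flat_points(B, a, p):
    """All x in GF(p)^(m-1) with B x^T = a, found by exhaustive search."""
    dim = len(B[0])
    return [x for x in product(range(p), repeat=dim)
            if all(sum(c * xi for c, xi in zip(row, x)) % p == rhs % p
                   for row, rhs in zip(B, a))]


print(euclidean_parameters(3, 2, 3))    # lines of the plane EG(2, 3)
print(flat_points([[1, 2]], [1], 3))    # the line x1 + 2*x2 = 1
```

The exhaustive search is only meant for small examples; a practical system would solve the linear system instead of scanning all q^(m-1) vectors.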

12.  (v,k,t)-designs.  Let v ≥ k ≥ t > 0, λ and b be positive
integers.  Let A be a set of v elements.  Let A_1, A_2, ..., A_b be
the blocks of a (v,k,t,b)-configuration.  If every t-element subset
of A is contained in exactly λ blocks, then the (v,k,t,b)-configuration
is called a (v,k,t,λ)-design.  (v,k,2,λ)-designs are also
called balanced incomplete block designs (bibd).  It is easily seen
that (v,k,t,λ)-designs are optimum (v,k,t,b)-configurations.  The
problem of constructing (v,k,t,λ)-designs and more generally optimum
(v,k,t,b)-configurations is extremely difficult.  If k = t, then the
class of k-element subsets is a (v,k,k,1)-design.  Except for
this trivial (v,k,k,1)-design, no (v,k,t,1)-design with t > 5 is
known.  For a file F = (I, A, f) occurring in practice, Max_{i∈I} |f(i)|
will be greater than 5.  Therefore the known (v,k,t,1)-designs will not
be very useful for constructing local structuring.  The following table

There is a large literature on bibd.  Since the known (v,k,t,1)-
designs will not be of much practical interest for constructing local
structuring, we do not discuss the details of construction of these
designs.  However we like to emphasize the point that the theory of
(v,k,t,1)-designs for large t will be vitally important for constructing
efficient local structuring.
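
The defining property of a (v,k,t,λ)-design can be checked directly by counting, for each t-element subset, the blocks that contain it. The Python sketch below does this for small examples. The function name `design_lambda` is hypothetical, and the seven blocks used as test data are the lines of the projective plane of order 2, a standard example of a (7,3,2,1)-design that is not taken from the paper.

```python
from itertools import combinations
from collections import Counter


def design_lambda(points, blocks, t):
    """Return λ if every t-subset of `points` lies in exactly λ blocks, else None."""
    counts = Counter()
    for block in blocks:
        for sub in combinations(sorted(block), t):
            counts[sub] += 1
    # A Counter reports 0 for t-subsets that no block contains.
    values = {counts[sub] for sub in combinations(sorted(points), t)}
    return values.pop() if len(values) == 1 else None


fano = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
        {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
print(design_lambda(range(1, 8), fano, 2))

# The trivial design: all k-element subsets with t = k.
trivial = [set(c) for c in combinations(range(5), 3)]
print(design_lambda(range(5), trivial, 3))
```

Both calls report λ = 1: every pair of points lies on exactly one of the seven lines, and every 3-element subset is itself exactly one block of the trivial design.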


Let v ≥ k ≥ t > 0 be integers.  Let A be a v-element set and
ℬ = {A_1, A_2, ..., A_b} be such that A_i ⊂ A and |A_i| ≤ k, 1 ≤ i ≤ b.
For A' ⊂ A, define

    c(A', ℬ) = |{i : |A' ∩ A_i| ≥ t, 1 ≤ i ≤ b}| .

Let C = (A, ℬ) be a (v,k,t,b)-configuration with ℬ = {A_1, A_2, ..., A_b}.
Let ℬ_i = {A_1, A_2, ..., A_i}, 1 ≤ i ≤ b.  The configuration C is said

to  be  locally  optimum  iff  for  V i,2<  i<b,A*^  ^ • 

Clearly a (v,k,t,1)-design is a locally optimum (v,k,t,b)-configuration.
After choosing the first i-1 blocks, we try to choose the ith block
such that the number of new t-element subsets covered by the ith block
is as large as possible.
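
The block-by-block selection just described can be sketched as a greedy procedure. The Python code below is only an illustration under strong assumptions: every block has exactly k elements, ties are broken by taking the first candidate, and the candidates are found by exhaustive search over all k-element subsets, which is feasible only for very small v. The function name is hypothetical.

```python
from itertools import combinations


def greedy_configuration(v, k, t):
    """Choose k-subsets of {0, ..., v-1} one at a time; each new block covers
    as many not yet covered t-subsets as possible (exhaustive search)."""
    uncovered = set(combinations(range(v), t))
    candidates = list(combinations(range(v), k))
    blocks = []
    while uncovered:
        best = max(
            candidates,
            key=lambda blk: sum(1 for s in combinations(blk, t) if s in uncovered),
        )
        blocks.append(best)
        uncovered -= set(combinations(best, t))
    return blocks


blocks = greedy_configuration(7, 3, 2)
print(len(blocks), blocks)
```

The loop ends when every t-element subset is covered, so the result is a (v,k,t,b)-configuration, although the number of blocks produced need not be the smallest possible.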

14.  Multistage local structuring.  Let F = (I, A, f) be a file
such that |A| = v, |I| = n and Max_{i∈I} |f(i)| = t.  Let ℓ = (π, δ) be a
local structuring of F with π = (I_1, I_2, ..., I_{b_1}) and
δ = (A_1, A_2, ..., A_{b_1}).  Let Max_j |A_j| = k and F_j = (I_j, A_j, f_j),
1 ≤ j ≤ b_1, be the local files.  If k > t, and each |I_j| is large, we
can again apply to the local files
the principle of local structuring.  A 2-stage local structuring L of
the file F is a (b_1+1)-tuple (ℓ, ℓ_1, ..., ℓ_{b_1}) where ℓ is a local
structuring of F with local files F_1, ..., F_{b_1} and ℓ_j is a local
structuring of the file F_j, 1 ≤ j ≤ b_1.  Let ℓ_j = (π_j, δ_j), 1 ≤ j ≤ b_1.
Suppose π_j = (I_{j1}, I_{j2}, ..., I_{jb_2}) and
δ_j = (A_{j1}, A_{j2}, ..., A_{jb_2}), 1 ≤ j ≤ b_1.  Let
F_{jl} = (I_{jl}, A_{jl}, f_{jl}) where f_{jl} is the restriction
of f_j to I_{jl}, 1 ≤ j ≤ b_1, 1 ≤ l ≤ b_2.  Let s_{jl} and r_{jl}
respectively denote storage and retrieval rules for the local files F_{jl},
1 ≤ j ≤ b_1, 1 ≤ l ≤ b_2.  For A' ⊂ A let

    J_{A'} = {(j,l) : 1 ≤ j ≤ b_1, 1 ≤ l ≤ b_2, A_{jl} ⊇ A'} .

We can define a storage rule s and retrieval rule r for the file F
by setting

    s(i) = s_{jl}(i) ,  if i ∈ I_{jl} ,

and

    r(A') = ∪_{(j,l) ∈ J_{A'}} r_{jl}(A') .

To get an expression for T_L(r), the average retrieval time required for
a query of size r, we assume that |A_j| = k_1 and |A_{jl}| = k_2 for
1 ≤ j ≤ b_1 and 1 ≤ l ≤ b_2.  The subsets A_j and A_{jl} are respectively
called first stage and second stage blocks, 1 ≤ j ≤ b_1, 1 ≤ l ≤ b_2.
We also assume that for each second stage local file F_{jl} an inverted
filing system with lexicographic storage and retrieval rules is used.
As before let d' and c respectively denote the amount of time required
to compare a pair of attribute numbers and a pair of accession numbers.
Let ν_p(A') denote the number of pth stage blocks containing A',
p = 1, 2.  We assume


“ b^b^ 


To retrieve for a query A' of size r the computer will take d'b_1 r
units of time to check the relations A' ⊂ A_j, 1 ≤ j ≤ b_1.  For each
of the first stage blocks A_j containing A', the computer will take
d'b_2 r units of time to determine the second stage blocks which contain
A'.  Finally for each second stage block A_{jl} containing A', the
comparison of accession numbers will require

    (cn/(b_1 b_2)) Σ_{i=1}^{r-1} 1/\binom{k_2}{i}

units of time.  Therefore for r > 1 we get the expression

    T_L(A') = d'b_1 r + d'ν_1(A') r b_2 + ν_2(A') (cn/(b_1 b_2)) Σ_{i=1}^{r-1} 1/\binom{k_2}{i} .

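
The cost accounting of the 2-stage scheme can be illustrated with a small Python sketch. It uses only the three contributions named in the text: d'b_1 r for the first stage test, d'b_2 r for each first stage block containing A', and an accession number comparison for each second stage block containing A'. The per-block accession time is passed in as a parameter rather than computed, and the function and parameter names are hypothetical.

```python
def two_stage_retrieval_time(r, b1, b2, nu1, nu2, d_attr, accession_time):
    """Retrieval time for a query of size r under 2-stage local structuring.

    nu1, nu2       : numbers of first / second stage blocks containing the query
    d_attr         : time to compare a pair of attribute numbers (d' in the text)
    accession_time : accession number comparison time for ONE second stage
                     block (supplied by the caller; an assumption of this sketch)
    """
    first_stage = d_attr * b1 * r           # test A' against all b1 first stage blocks
    second_stage = nu1 * d_attr * b2 * r    # b2 tests inside each relevant first stage block
    accession = nu2 * accession_time        # one inverted-file search per relevant second stage block
    return first_stage + second_stage + accession


# Illustrative numbers only: v = 10 attributes, query size r = 2,
# b1 = 10, b2 = 9, nu1 = 8, nu2 = 8 * 7.
print(two_stage_retrieval_time(2, 10, 9, 8, 56, d_attr=1, accession_time=5))
```

Separating the three terms makes it easy to see which of them dominates as the number of blocks or the size of the local files grows.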

Let d > 1 be a positive integer.  A d-stage local structuring L for
the file F is a (1 + Σ_{p=1}^{d-1} b_1 b_2 ... b_p)-tuple

    (ℓ, ℓ_{j_1...j_p} ; 1 ≤ p ≤ d-1, 1 ≤ j_p ≤ b_p)

where b_1, b_2, ..., b_d are positive integers, ℓ = (π, δ) is a local
structuring of F, π = (I_1, I_2, ..., I_{b_1}), δ = (A_1, A_2, ..., A_{b_1}),

    ℓ_{j_1...j_p} = (π_{j_1...j_p}, δ_{j_1...j_p}) ,
    π_{j_1...j_p} = (I_{j_1...j_p 1}, I_{j_1...j_p 2}, ..., I_{j_1...j_p b_{p+1}}) ,
    δ_{j_1...j_p} = (A_{j_1...j_p 1}, A_{j_1...j_p 2}, ..., A_{j_1...j_p b_{p+1}}) ,

ℓ_{j_1...j_p} is a local structuring of the pth stage local file
F_{j_1...j_p} = (I_{j_1...j_p}, A_{j_1...j_p}, f_{j_1...j_p}), and
F_{j_1...j_p} is a local file of ℓ_{j_1...j_{p-1}}, 1 ≤ p ≤ d-1,
1 ≤ j_p ≤ b_p.  The subsets A_{j_1...j_p} are called the pth stage
blocks.  In our definition, we require that the number of pth stage
blocks for each local structuring ℓ_{j_1...j_{p-1}} is a constant
number.  This assumption is, of course, not necessary.  For the sake
of simplicity we assume that the cardinality of each pth stage block
is k_p, 1 ≤ p ≤ d.  Suppose inverted filing systems with lexicographic
storage and retrieval rules are used for the dth stage local files.
Further we assume that


n,  ll,  , |=  , ll.  . (A’)l  = 


^••^dT^d 


for 


A' ∈ P_r(A).  Let ν_p(A') denote the number of pth stage blocks which
contain A'.  Arguing as in the case of 2-stage local structuring,
we get

    T_L(A') = d'r (b_1 + Σ_{p=1}^{d-1} ν_p(A') b_{p+1}) + ν_d(A') (cn/(b_1 b_2 ... b_d)) Σ_{i=1}^{r-1} 1/\binom{k_d}{i} .    (22)



15.  Multistage Boolean local structuring.  Suppose the first stage
local structuring ℓ is a (v, k_1 = v-1, t) Boolean local structuring and
each second stage local structuring is a (v-1, k_2 = v-2, t) Boolean local
structuring.  Suppose in general each pth stage local structuring is
a (v-p+1, k_p = v-p, t) Boolean local structuring with p = 1, 2, ..., (v-t).
Then we have k_1 = v-1, k_2 = v-2, ..., and k_{v-t} = t.  Also for
A' ∈ P_r(A), r ≤ t,

    ν_1(A') = v-r ,  ν_2(A') = (v-r)(v-1-r) ,  ... ,
    ν_p(A') = (v-r)(v-1-r) ... (v+1-p-r) ,  p = 1, 2, ..., (v-t) ,

and b_1 = v, b_2 = v-1, ..., b_{v-t} = t+1.
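
The stage parameters listed above are simple to tabulate. The following Python sketch returns b_p, k_p and ν_p(A') for p = 1, ..., v-t, given v, t and the query size r. The function name is hypothetical, and the code only restates the formulas of this section.

```python
from math import prod


def boolean_stage_counts(v, t, r):
    """Lists (b_p), (k_p), (nu_p) for p = 1, ..., v-t in the multistage
    Boolean local structuring, for a query of r <= t attributes."""
    stages = range(1, v - t + 1)
    b = [v - p + 1 for p in stages]                 # b_1 = v, ..., b_{v-t} = t+1
    k = [v - p for p in stages]                     # k_1 = v-1, ..., k_{v-t} = t
    nu = [prod(v - r - s for s in range(p))         # (v-r)(v-1-r)...(v+1-p-r)
          for p in stages]
    return b, k, nu


b, k, nu = boolean_stage_counts(6, 3, 2)
print(b, k, nu)
```

For v = 6, t = 3 there are three stages, and the last entries of the first two lists are t+1 = 4 and t = 3, as the text requires.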

The  expression  (22)  simplifies  to 
v-t-1 

Tj^(r)  = d'r  ^ (v-r)  (v  - 1 - r)...(v+ 1- p - r)  (v-p)^  + 


P=1 


r-1 


(23) 
Comparing (20) and (23), we get

VO  - T^^(0 . a(.„,v,t) . =.  I - 0^ 

where a(d,r,v,t) is a function of d, r, v and t.  If 1 ≤ i ≤ r-1,

(y  ~ ~ --  ~ -^-1  < 1 . Therefore,  for  fixed  c,d,r,v,t,  r > 1 

and sufficiently large n, T_L(r) will be smaller than T_I(r).
Multistage projective and Euclidean local structurings can be developed
in a manner similar to that of multistage Boolean local structuring.